SBIR Based Screening for Lung Cancer
G. Mary Valantina, Z. Mary Livinsa
Department of ETCE, Sathyabama Institute of Science and Technology, Chennai, Tamilnadu, India.
*Corresponding Author E-mail: valantina78@gmail.com, livinsa@gmail.com
ABSTRACT:
Lung cancer is among the deadliest types of cancer worldwide. Early detection can save lives and increase the survival of patients. In this project we obtain a solution for lung cancer symptom detection by applying shape based image retrieval (SBIR). Our algorithm is broadly divided into three parts. In the first part we accept the data set of cancer symptoms, which is a generalized way of creating the patterns for the lung cancer framework. In the second part we find the relevant data from the patterns using a segmentation approach. Only the frequent symptoms are selected, by means of a threshold value. Based on the threshold value we decide whether a cell is a cancer cell or a non-cancer cell. We initialize the cancer cell value to support the pattern of cancer symptoms, and it is updated in each trial. By updating the cancer cell value in each step we can check the symptom precision, which shows whether the accuracy increases or decreases. Finally, the results are analyzed using an artificial neural network algorithm.
KEYWORDS: Dicom, SVM, ANN, k-NN, Accuracy, Sensitivity, Precision and Specificity.
I. INTRODUCTION:
Based on statistics provided by the American Cancer Society, lung cancer is the primary cause of cancer-related death in the United States [1]. In the United States, 222,520 patients were diagnosed with lung cancer in 2010 [2]. According to SEER 17 data, the age-adjusted incidence of cancer of the lung and bronchus was 62.5 per 100,000 people per year from 2003 to 2007. The five-year survival rate of a patient diagnosed with lung cancer at a late stage is only 10%-15% [3]. If pulmonary nodules are diagnosed at an early stage, the five-year survival rate can be increased up to 65%-80% [4]. This shows that early detection and volumetric analysis of lung nodules are crucial steps in the fight against lung cancer. A pulmonary nodule can be defined as an approximately round opacity with a maximum diameter of less than 3 cm [5].
Depending on intensity, pulmonary nodules are classified into two types, namely solid nodules and ground glass nodules. Image retrieval provides a versatile means of searching for an image based on a description of the required image. Lung cancer is among the most common cancers occurring in both men and women. According to the report issued by the American Cancer Society in 2003, lung cancer accounts for about 13% of all diagnosed cancers and 28% of deaths caused by cancer. The five-year survival rate for diagnosed lung cancer is only 15%. If the disease is identified while it is still localized, this rate increases to 49%. However, only 15% of diagnosed lung cancers are at this early stage. This creates the need for early lung nodule detection [1] in chest Computed Tomography (CT) images. There are various detection techniques, such as the noninvasive biopsy technique that is widely used for early lung cancer detection. Some of them are highlighted here. The common detection methods for lung cancer are classical imaging methods such as chest radiography (film or digital) and computed tomography (CT).
Digital radiography provides higher contrast resolution with equal or higher spatial resolution when compared to classical radiography techniques. However, these techniques still do not give definitive information that can be used for the early detection of tumors. Low-dose spiral/helical CT is a promising modality for lung cancer screening. However, it is restricted to small peripheral lesions. Chain smokers develop tumors situated within the central airways, and as a result, other. Several publications have proposed different automated nodule recognition systems based on image processing, as well as different methods for segmentation, feature extraction and classification. Computed Tomography (CT) is regarded as the most widely used imaging technique for early detection of lung cancer. There is a demand for automated methods that make use of the enormous quantity of information obtained from CT images. Computer Aided Diagnosis (CAD) can be used for early detection of lung cancer. This study presents a CAD system that can automatically detect lung cancer nodules with a reduced false positive rate. In this study, different image processing techniques are first applied to obtain the lung region from chest CT scan images. Segmentation is then carried out with the help of a clustering algorithm. Finally, for automatic recognition of cancer nodules, a Support Vector Machine (SVM) is employed, which helps in better classification of cancer nodules. The experiment is conducted for the proposed technique on CT images. In many articles, content-based access to medical images for supporting clinical decision-making has been proposed, which may ease the management of clinical data.
II. RELATED WORK:
One of the earliest CBIR systems dealing with lung CT images is the ASSERT project at Purdue University, first published in 1999. It investigated image characteristics such as co-occurrence statistics, shape descriptors, Fourier transforms and global gray level statistics. Kawata et al. [6] created a CBIR system for lung nodules in 2004 that used shape descriptors and density histograms for grouping and retrieving 3-D lung CT volumes, but the precision and recall of this CBIR system were not reported. Disney et al. [7] in 2007 proposed an open source pulmonary nodule image retrieval framework for chest CT images named BRISC. They used gray-level co-occurrence features, Haralick features and Gabor filters, and achieved a precision of 88% when a single nodule is retrieved. An automated nodule segmentation platform and shape based features of the nodule were not explored in their studies. The various parts of a nodule CBIR system are preprocessing, nodule detection, false positive reduction, nodule segmentation, feature extraction, similarity measurement and retrieval of similar nodules. The study was performed by acquiring the heat patterns emitted by the object. Similarly, S. Katsuragawa et al. [8] developed a CMOS image sensor for the detection of incoming light rays using polarization information. It was shown that polarimetric information such as DOP, electric field vector intensities and some Stokes parameters such as ellipticity and azimuthal angle can be used as a reference source for navigation with low-complexity retrieval. In [9], the authors affirm that adaptive median filtering is required to correct the poor contrast caused by poor lighting conditions during image acquisition. They produced a low frequency image by replacing every pixel value with a median pixel value computed over a square area of 5x5 pixels.
Then, a contrast limited adaptive histogram equalization (CLAHE) method is used to improve the contrast of the pre-processed CT image. At the same time, Farag et al. insist in [10] that the filtering approach used must preserve object boundaries and fine structures, sharpen the discontinuities to enhance morphological structures, and efficiently remove noise in homogeneous physical areas. In their work, the authors used both the Wiener and anisotropic diffusion filters.
Recently, various filters have been developed to enhance lung structures in 3-D images. Many researchers employed filters based on the eigenvalues of the Hessian matrix [11]. Frangi et al. [12] further developed this approach by introducing a 3D multi-scale structure enhancement filter based on the eigenvalues of the Hessian matrix and applying it to the enhancement of vessels. Later, E. M. van Rikxoort et al. were the first to propose supervised enhancement applied to single-phase and multi-phase techniques [13]. In [14], the authors applied a set of 3D morphologic filters to differentiate the nodule from other surrounding structures, such as vessels and bronchi. Segmentation of the lung regions is the next stage of the processing scheme. It refers to the process of partitioning the pre-processed CT image into various layers to separate the pixels corresponding to lung tissue from the surrounding anatomy on two CT data sets. An efficient lung nodule extraction scheme with an accuracy of 80.36% is developed in [14] by performing nodule segmentation with a weighted fuzzy probabilistic technique, and in [15] clustering is carried out for lung cancer images.
III. PROPOSED METHOD:
The proposed SBIR framework is shown in Figure 1. The database in which the CT scan images are stored is known as the image database. In preprocessing, the images are sharpened, classified and grouped for convenience in further processing. The proposed model is built from a collection of feature extraction methods for texture and gray scale resolution. The extracted features are then grouped into a single feature vector and stored in the feature database. When the user inputs a query image, the same steps (pre-processing and feature extraction) are applied as in the offline image database process to obtain the feature vector of the query image. The feature vector of the query image is then compared with the feature vectors in the feature database. Based on the result, the images that are most similar to the query image are retrieved from the database and displayed.
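The query step described above can be illustrated with a minimal sketch. The text does not state which similarity measure is used, so the Euclidean distance here is an assumption, and the image identifiers, feature values and function names are hypothetical.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve_similar(query_vector, feature_database, top_k=3):
    """Return the ids of the top_k database images closest to the query."""
    ranked = sorted(
        feature_database.items(),
        key=lambda item: euclidean_distance(query_vector, item[1]),
    )
    return [image_id for image_id, _ in ranked[:top_k]]

# Hypothetical feature database: image id -> feature vector (assumed values).
feature_database = {
    "ct_001": [120.0, 130.0, 12.4, 0.92],
    "ct_002": [300.0, 410.0, 19.5, 0.73],
    "ct_003": [118.0, 127.0, 12.3, 0.93],
}

query_vector = [119.0, 128.0, 12.3, 0.93]
closest = retrieve_similar(query_vector, feature_database, top_k=2)
print(closest)
```

In practice the features would probably need to be normalized before the distance is computed, since area-like features otherwise dominate ratio-like features such as solidity.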
The initial stage of the proposed Computer Aided Diagnosis (CAD) method (Wiemker et al., 2003; Wiemker et al., 2002) is obtaining and highlighting the lung region in the CT scan image. Basic image processing techniques are used for this purpose.
Fig 1: shape based image retrieval system as a diagnosis
A. Pre processing Enhancement:
Post-processing enhancement gives the lung image significantly better clarity for detecting nodules. Many researchers have employed filters based on the eigenvalues of the Hessian matrix [15][16]. Frangi et al. [17] further developed this approach by defining a 3D multi-scale structure enhancement filter based on the eigenvalues of the Hessian matrix and applying it to the enhancement of vessels. The steps involved in enhancement after segmentation are given below:
1. Small objects seen inside and outside the lungs in the segmented image are removed by morphological opening.
2. Next, the borders are enhanced and the gaps in the border are closed by morphological closing.
3. The morphological operations are followed by Canny edge detection, which extracts the boundary of the enhanced image.
4. Morphological thinning is then carried out on the boundary-extracted image.
5. To obtain the final post-processed image, morphological filling is performed to remove unwanted muscle parts from the image, leaving only the lungs. Figure 3 shows the post-processing enhancement technique in detail.
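Steps 1 and 2 above can be illustrated with a small sketch of binary opening and closing. It assumes a 3x3 square structuring element and treats pixels outside the image as background; neither choice is stated in the text. Edge detection, thinning and filling are not covered, and the toy mask is hypothetical.

```python
def neighbourhood(mask, r, c):
    """Yield the 3x3 neighbourhood of (r, c); pixels outside the image count as 0."""
    rows, cols = len(mask), len(mask[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            yield mask[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0

def erode(mask):
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return [[int(all(neighbourhood(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]

def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighbourhood is foreground."""
    return [[int(any(neighbourhood(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]

def opening(mask):
    """Erosion followed by dilation: removes objects smaller than the element."""
    return dilate(erode(mask))

def closing(mask):
    """Dilation followed by erosion: closes small gaps in object borders."""
    return erode(dilate(mask))

# Toy 10x11 mask: a 5x7 block with a one-pixel gap in its border and a speck.
mask = [[0] * 11 for _ in range(10)]
for r in range(2, 7):
    for c in range(2, 9):
        mask[r][c] = 1
mask[2][5] = 0  # one-pixel gap in the upper border
mask[8][1] = 1  # isolated small object

cleaned = closing(opening(mask))
for row in cleaned:
    print("".join(str(v) for v in row))
```

Opening removes the isolated pixel while leaving the gap in place, and the subsequent closing fills the gap, which mirrors the order of steps 1 and 2.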
B. Lung CT Image Segmentation:
Image segmentation consists of separating the lung nodule from the other parts of the CT scan image and then enhancing the resulting image to bring out details. This process consists of the following steps.
1. The input image is converted into a grayscale image, and a Non-Local Means filter is used to remove Gaussian white noise.
2. Otsu's threshold is applied to segment the lung region from the CT image of the lungs.
Figure 2 shows the original image and the segmented image along with the background-eliminated image.
Fig 2: Segmentation: a) Source Image, b) Background removed Image, c) Threshold Image
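The thresholding step of the segmentation can be sketched as follows. This is a plain-Python illustration of Otsu's method, which picks the gray level that maximizes the between-class variance; the 8-bit gray range and the small test image are assumptions, and the noise filtering step is not included.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0
    for level in range(levels):
        weight_low += histogram[level]
        sum_low += level * histogram[level]
        if weight_low == 0:
            continue  # no pixels in the lower class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break  # all pixels are in the lower class
        mean_low = sum_low / weight_low
        mean_high = (total_sum - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold

# Hypothetical 3x4 grayscale image with a dark and a bright population.
image = [
    [10, 12, 11, 200],
    [9, 13, 210, 205],
    [11, 10, 198, 202],
]
threshold = otsu_threshold([v for row in image for v in row])
binary = [[1 if v > threshold else 0 for v in row] for row in image]
print(threshold, binary)
```

Whether the lung region corresponds to the class above or below the threshold depends on the image intensities, so the final mask may need to be inverted.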
C. Lung Nodule Feature Extraction and Classification:
The aim of feature extraction is to highlight the required characteristics of the nodule, and it is usually considered to be one of the critical problems of nodule prediction. The nodule is described by highlighting the features that are vital for it while excluding the unimportant attributes. Hence, distinct features for lung nodule differentiation are considered, and the feature vector thus formulated is FV = {F1, F2, F3, F4, F5, F6}.
Structural Feature:
The structural features of the nodule are computed, i.e. Area, Convex Area, Equivalent Diameter and Solidity.
AREA: A scalar value that gives the number of pixels in the Region of Interest (ROI).
CONVEX AREA: A scalar value that gives the number of pixels in the convex image of the Region of Interest (ROI), which is a binary image with all pixels inside the convex hull filled in.
EQUIV DIAMETER: The diameter of a circle with the same area as the Region of Interest (ROI), as defined in (2.1).
$$\text{EquivDiameter} = \sqrt{\frac{4 \cdot \text{Area}}{\pi}} \qquad (2.1)$$
SOLIDITY: The proportion of the pixels in the convex hull that are also in the Region of Interest (ROI), as defined in (2.2).
$$\text{Solidity} = \frac{\text{Area}}{\text{ConvexArea}} \qquad (2.2)$$
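The structural features can be illustrated with a short sketch. The equivalent diameter follows from the verbal definition above (a circle of diameter d has area pi*d^2/4), and solidity is the ratio of area to convex area. The toy U-shaped mask is hypothetical, and its convex image is supplied by hand rather than computed from a convex hull.

```python
import math

def region_area(mask):
    """Number of foreground pixels in a binary mask."""
    return sum(1 for row in mask for value in row if value)

def equiv_diameter(area):
    """Diameter of the circle whose area equals the given pixel count."""
    return math.sqrt(4.0 * area / math.pi)

def solidity(area, convex_area):
    """Fraction of the convex image that is covered by the region."""
    return area / convex_area

# Hypothetical U-shaped ROI on a 4x4 grid.
roi_mask = [
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
# Its convex image is the full 4x4 square, given here by hand (assumption:
# a real pipeline would compute this from the convex hull of the ROI).
convex_mask = [[1] * 4 for _ in range(4)]

area = region_area(roi_mask)
convex_area = region_area(convex_mask)
diameter = equiv_diameter(area)
print(area, convex_area, round(diameter, 3), solidity(area, convex_area))
```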
Textural Feature:
The textural features of the nodule are computed, i.e. energy, mean and variance. Energy is used to describe a measure of the information in an image, as given in equation (2.3).
MEAN: The mean intensity is the average intensity value of all the pixels that belong to the region, and is calculated by equation (2.4).
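Since equations (2.3) and (2.4) are not reproduced in the text, the following sketch uses common first-order definitions as assumptions: the mean and population variance of the region intensities, and energy taken as the sum of squared normalized histogram probabilities. The authors may have used different definitions, in particular for energy.

```python
from collections import Counter

def mean_intensity(pixels):
    """Average intensity of the pixels in the region."""
    return sum(pixels) / len(pixels)

def variance_intensity(pixels):
    """Population variance of the intensities in the region."""
    mu = mean_intensity(pixels)
    return sum((value - mu) ** 2 for value in pixels) / len(pixels)

def energy(pixels):
    """Sum of squared gray-level probabilities (assumed definition)."""
    counts = Counter(pixels)
    n = len(pixels)
    return sum((count / n) ** 2 for count in counts.values())

# Hypothetical region intensities.
region_pixels = [10, 10, 20, 20]
print(mean_intensity(region_pixels),
      variance_intensity(region_pixels),
      energy(region_pixels))
```

Under this definition, energy equals 1 for a perfectly uniform region and decreases as the intensities spread over more gray levels.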
D. Algorithm of SBIR
· Initially, an arbitrary pixel (r, c) is picked from the main region of the image that is to be segmented. This pixel is known as the seed pixel.
· The nearest neighbors are then tested individually, and a neighboring pixel is accepted as belonging to the same region if it satisfies the homogeneity property of the region.
· Once a new pixel is accepted as a member of the current region, the nearest neighbors of this new pixel are examined.
· This process continues recursively until no more pixels are accepted.
· All the pixels of the current region are marked.
· Then another seed pixel is picked and the same process is repeated.
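The steps above describe region growing, which can be sketched as follows. Several details are assumptions not fixed by the text: the homogeneity test compares each pixel with the seed intensity using a tolerance, neighbors are 4-connected, seeds are taken in raster order rather than arbitrarily, and a queue replaces recursion to avoid deep call stacks.

```python
from collections import deque

def grow_region(image, seed, tolerance, labels, label):
    """Mark every pixel connected to the seed that passes the homogeneity test."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    labels[seed[0]][seed[1]] = label
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Examine the 4-connected neighbors of the newly accepted pixel.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and labels[nr][nc] == 0
                    and abs(image[nr][nc] - seed_value) <= tolerance):
                labels[nr][nc] = label
                queue.append((nr, nc))

def segment(image, tolerance):
    """Repeat region growing from new seeds until every pixel is marked."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    region_count = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] == 0:  # unmarked pixel becomes the next seed
                region_count += 1
                grow_region(image, (r, c), tolerance, labels, region_count)
    return labels, region_count

# Hypothetical 3x3 image with a dark and a bright area.
image = [
    [10, 11, 200],
    [12, 10, 205],
    [11, 198, 202],
]
labels, region_count = segment(image, tolerance=10)
print(region_count, labels)
```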
In [18], the authors applied a set of 3D morphologic filters to separate the nodule from other surrounding structures, such as vessels and bronchi. For classification, the feature vector is given as input to the classifier. The SVM classifier performs classification in three steps. First, data are selected from the database to train the classifier for two classes. Second, the selected feature input is transformed into a high-dimensional space using a nonlinear mapping. Third, a linear separating hyperplane is sought in the new space. Following these steps, the SVM classifier is trained for two classes. The classifier is then used to predict lung cancer at an early stage or to predict the status of the patient.
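The nonlinear mapping is usually applied implicitly through a kernel function. As an illustration only, the sketch below evaluates a radial basis function (RBF) kernel, the kernel named in the conclusion of this paper, on a few feature vectors. It does not train an SVM; the value of gamma and the feature vectors are assumptions.

```python
import math

def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance between x and y)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def gram_matrix(vectors, gamma):
    """Pairwise kernel values, the quantity an SVM solver works with."""
    return [[rbf_kernel(u, v, gamma) for v in vectors] for u in vectors]

# Hypothetical normalized feature vectors.
vectors = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.0]]
kernel_values = gram_matrix(vectors, gamma=0.5)
for row in kernel_values:
    print([round(value, 4) for value in row])
```

Vectors that are close in feature space give kernel values near 1, and distant vectors give values near 0, which is how the kernel encodes similarity in the implicit high-dimensional space.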
IV. EXPERIMENTAL RESULTS:
The input set of lung images is taken from the National Lung Screening Trial (NLST) data for stage I and stage II. The sample images used for experimentation number 111 for stage I and 73 for stage II lung cancer. Of these, four-fifths of the data are used for training and the remaining one-fifth is used for testing the classifiers. The test dataset contains 24 images of stage I and 17 images of stage II. The confusion matrix is shown in the first table. TP is 24, meaning that 24 images of stage I are predicted as stage I. FP is 2, meaning that 2 images of stage II are predicted as stage I. FN is 0, meaning that no image of stage I is predicted as stage II. TN is 15, meaning that 15 images of stage II are predicted as stage II.
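The performance measures listed in the keywords can be derived directly from these four counts. The sketch below uses the standard definitions, with stage I treated as the positive class as in the description above; the function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard measures derived from a two-class confusion matrix."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": tp / (tp + fp),
    }

# Counts reported for the test set (stage I is the positive class).
metrics = classification_metrics(tp=24, fp=2, fn=0, tn=15)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

With these counts the accuracy is 39/41, about 95.12%, the sensitivity is 100%, the specificity is 15/17, about 88.24%, and the precision is 24/26, about 92.31%.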
The experiments are conducted on the proposed computer-aided diagnosis system with lung images obtained from the website. The experimental data consist of lung images, which are passed to the proposed CAD system. The diagnosis rules are then generated from those images, and these rules are passed to the Shape Based Image Retrieval (SBIR) for the learning process. After learning, a lung image is passed to the proposed CAD system. The system then runs through its processing steps and finally detects whether or not the supplied lung image shows cancer.
The results show that there are a few missed detections; however, the overall efficiency of the vision-based measure is more than 95%. The performance of the CBIR system is reported in terms of precision and recall, where precision is the ratio of relevant retrieved nodules to all retrieved nodules, and recall is the ratio of relevant retrieved nodules to all relevant nodules present in the database.
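The two retrieval measures can be written out as a short sketch. The nodule identifiers and the example query result are hypothetical and are not taken from the experiments.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of one retrieval result.

    retrieved: list of nodule ids returned for the query
    relevant:  set of nodule ids in the database that are truly relevant
    """
    relevant_retrieved = sum(1 for nodule_id in retrieved if nodule_id in relevant)
    precision = relevant_retrieved / len(retrieved)
    recall = relevant_retrieved / len(relevant)
    return precision, recall

# Hypothetical query result: 4 nodules retrieved, 3 relevant in the database.
retrieved = ["n1", "n2", "n3", "n4"]
relevant = {"n1", "n3", "n5"}
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```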
Fig 3: (a) canny edge detection (b) Lung nodule is detected after segmentation.
Fig 4: (a) classification (b) disease detection
V. CONCLUSION:
Disease diagnosis is a continuously evolving and highly active field of research. The intention of this study was to predict the status of the patient for early detection of lung cancer. We presented a CBIR system that considers the lung nodule as the pathology bearing region, for differential diagnosis and for the training of budding radiologists. The performance of CBIR can be enhanced by improving the knowledge base through collecting more feedback from expert radiologists. The result is highly encouraging: tested on an SVM classifier with an RBF kernel, the data yielded an accuracy of 95.12%. A comparison of classification accuracy for ANN, KNN and SVM classifiers was made on lung CT scan images of stage I and stage II. The results show that there are a few missed detections; however, the overall efficiency of the vision-based measure is over 95%. With this accuracy, the lives of many patients with abnormal cases of lung cancer can be saved.
VI. REFERENCES:
1. Cancer Facts and Figures 2009, American Cancer Society, http://www.cancer.org
2. Anthony V D’Antoni, Genevieve Pinto Zipp, Valerie G Olson and Terrence F Cahill, “Does the mind map learning strategy facilitate information retrieval and critical thinking in medical students?", BMC Med Educ. 2010.
3. Stefan Diederich et al., “Screening for early lung cancer with low-dose spiral CT: prevalence in 817 asymptomatic smokers", Radiology, vol. 222, no.3, pp. 773-781, 2002.
4. Ichiro Yoshino, Masafumi Yamaguchi, Testuzo Tagawa, Seiichi Fukuyama, Toshifumi Kameyama, Atsushi Osoegawa and Yoshihiko Maehara, “Operative results of clinical stage I non-small cell lung cancer", Lung Cancer, vol. 42, no. 11, May 2003.
5. Austin et al., “Glossary of terms for CT of lungs; recommendations of the Nomenclature Committee of the Fleischner Society", Thoracic Radiology, vol. 200, pp. 327-331, April 1996.
6. Y. Kawata, N. Niki, H. Ohmatsu, M. Kusumoto, R. Kakinuma, K. Yamada, K. Mori, H. Nishiyama, K. Eguchi, M. Kaneko, and N. Moriyama, “Pulmonary nodule classification based on nodule retrieval from 3-D thoracic CT image database”, Medical Image Computing and Computer-Assisted Intervention (MICCAI 2004).
7. Michael O. Lam, Tim Disney, Daniela S. Raicu, Jacob Furst and David S. Channin, “BRISC-An Open Source Pulmonary Nodule Image Retrieval Framework", Journal of digital imaging, 2007.
8. Arimura, S. Katsuragawa and K. Suzuki, “Computerized scheme for automated detection of lung nodules in low-dose computed tomography images for lung cancer screening”, Acad. Radiol., Vol. 11, pp. 617-629, 2004.
9. Ambrosini, S. Nicolini, P. Caroli, C. Nanni, A. Massaro, M.-C. Marzola, D. Rubello and S. Fanti, “PET/CT imaging in different types of lung cancer: An overview”, European Journal of Radiology, Vol. 81, pp. 988-1001, 2013.
10. El-Baz, A. Farag, R. Falk and R. LaRocca, “Automatic identification of lung abnormalities in chest spiral CT scans”, in Proc. of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), Vol. 2, pp. 261-264, 2003.
11. El-Baz, A. Farag, G. Gimelfarb, R. Falk, M.-A. El-Ghar and T. Eldiasty, “A framework for automatic segmentation of lung nodules from low dose chest CT scans”, in Proc. of the 18th International Conference on Pattern Recognition (ICPR 06), Vol. 3, pp. 611-614, 2006.
12. El-Baz, G. Gimelfarb, R. Falk and M. Abo El-Ghar, “3D MGRF-based appearance modelling for robust segmentation of pulmonary nodules in 3D LDCT chest images”, in Lung Imaging and Computer Aided Diagnosis, chapter 3, pp. 51-63, Taylor and Francis edition, 2011.
13. J.-P. Kockelkorn, E.-M. Van Rikxoort, J.-C. Grutters and B. Van Ginneken, “Interactive lung segmentation in CT scans with severe abnormalities”, in Proc. of the 7th IEEE International Symposium on Biomedical Imaging: From Nano to Macro (ISBI '10), pp. 564-567, 2010.
14. Ashwin, S.-A. Kumar, J. Ramesh and K. Gunavathi, “Efficient and Reliable Lung Nodule Detection using a Neural Network Based Computer Aided Diagnosis System”, in Proc. of the International Conference on Emerging Trends in Electrical Engineering and Energy Management (ICETEEEM'2012), pp. 135-142, Chennai, 13-15 Dec. 2012.
15. Farag, J. Graham, A. Farag and R. Falk, “Lung Nodule Modelling A Data-Driven approach”, Advances in Visual Computing, Vol. 5875, pp. 347-356, 2009.
16. Haussecker and B. Jahne, “A tensor approach for local structure analysis in multidimensional images in 3-D”, Image Anal. Synthesis, pp. 171-178, 1996.
17. Frangi, W. Niessen, K. Vincken and M. Viergever, “Multiscale vessel enhancement filtering”, Med. Image Computing Computer Assisted Intervention, Vol. 1496, pp. 130-137, 1998.
18. Zhao, A.-P. Reeves, D.-F. Yankelevitz and C.-I. Henschke, “Three-dimensional multicriterion automatic segmentation of pulmonary nodules of helical computed tomography images”, Optical Engineering, Vol. 38, No. 8, pp. 1340-1347, 1999.
Received on 04.09.2018 Modified on 03.10.2018
Accepted on 02.11.2018 © RJPT All right reserved
Research J. Pharm. and Tech 2019; 12(1): 62-66.
DOI: 10.5958/0974-360X.2019.00012.X